Abstract. We provide a simple algorithm for finding the optimal
upper bound for sums of products of matrix entries of the form

$$S_\pi(N) := \sum_{\substack{j_1,\dots,j_{2m}=1 \\ \ker j \ge \pi}}^{N} t^{(1)}_{j_1 j_2} t^{(2)}_{j_3 j_4} \cdots t^{(m)}_{j_{2m-1} j_{2m}},$$

where some of the summation indices are constrained to be equal.
The upper bound is easily obtained from a graph associated to
the constraints, $\pi$, in the sum.

1. Introduction

We want to consider sums of the form

$$S_\pi(N) := \sum_{\substack{j_1,\dots,j_{2m}=1 \\ \ker j \ge \pi}}^{N} t^{(1)}_{j_1 j_2} t^{(2)}_{j_3 j_4} \cdots t^{(m)}_{j_{2m-1} j_{2m}}, \tag{1}$$

where $T_k = (t^{(k)}_{ij})_{i,j=1}^{N}$ ($k = 1, \dots, m$) are given matrices and $\pi$ is a
partition of $\{1, 2, \dots, 2m\}$ which constrains some of the indices $j_1, \dots, j_{2m}$
to be the same.

The formal definition of this is given in the following notation.

Notation 1. 1) A partition $\pi = \{V_1, \dots, V_r\}$ of $\{1, \dots, k\}$ is a
decomposition of $\{1, \dots, k\}$ into disjoint non-empty subsets $V_i$; the $V_i$ are
called the blocks of $\pi$. The set of all partitions of $\{1, \dots, k\}$ is denoted
by $\mathcal{P}(k)$.

2) For $\pi, \sigma \in \mathcal{P}(k)$, we write $\pi \ge \sigma$ if each block of $\pi$ is a union of
some blocks of $\sigma$.

3) For a multi-index $j = (j_1, \dots, j_k)$ we denote by $\ker j \in \mathcal{P}(k)$ that
partition where $p$ and $q$ are in the same block if and only if $j_p = j_q$.

Thus, for a given partition $\pi \in \mathcal{P}(k)$, the constraint $\ker j \ge \pi$ in (1)
means that two indices $j_q$ and $j_p$ have to agree whenever $q$ and $p$ are
in the same block of $\pi$. Note that we do not exclude that more indices
might agree.
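As a small illustration of Notation 1, the following Python sketch computes $\ker j$ for a multi-index and tests the relation $\ker j \ge \pi$. The function names `ker` and `is_coarser` are ours and purely illustrative; positions are 1-based as in the text, and partitions are represented as sets of frozensets.

```python
def ker(j):
    """Return ker j as a set of frozensets of 1-based positions.

    Positions p and q share a block exactly when j[p-1] == j[q-1].
    """
    blocks = {}
    for pos, value in enumerate(j, start=1):
        blocks.setdefault(value, set()).add(pos)
    return {frozenset(b) for b in blocks.values()}


def is_coarser(pi, sigma):
    """Return True if pi >= sigma, i.e. every block of sigma lies inside a block of pi."""
    return all(any(s <= p for p in pi) for s in sigma)


# The constraint ker j >= pi allows additional coincidences among the indices.
pi = {frozenset({1}), frozenset({2, 3}), frozenset({4})}
print(is_coarser(ker((5, 7, 7, 9)), pi))  # only j_2 = j_3
print(is_coarser(ker((5, 7, 7, 5)), pi))  # additionally j_1 = j_4
print(is_coarser(ker((5, 7, 8, 9)), pi))  # j_2 != j_3 violates the constraint
```

The second call shows the point made above: a multi-index with more coincidences than $\pi$ requires still satisfies the constraint.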

The problem which we want to address is the optimal bound of the
sum (1). One expects a bound of the form

$$|S_\pi(N)| \le N^{\mathfrak{r}(\pi)} \prod_{k=1}^{m} \|T_k\|, \tag{2}$$

for some exponent $\mathfrak{r}(\pi)$, where $\|T\|$ denotes the operator norm of the
matrix $T$. The question is: what is the optimal choice of this exponent?

Our interest in sums of the form $S_\pi(N)$ was aroused by investigations
on random matrices where such sums show up quite canonically, see [3].
Indeed, when one considers the asymptotic properties of products of
random and deterministic matrices, one has to find efficient bounds for
the sums, $S_\pi(N)$, of products of entries of the deterministic matrices in
order to determine their contribution to the limiting distribution. Yin
and Krishnaiah [3], working on the product of random matrices, already
faced this problem and obtained the first results for some special cases;
a more systematic approach was given by Bai [1]. Our investigations
are inspired by the presentation in the book of Bai and Silverstein [2].

A first upper bound comes from the trivial observations that we
have in $S_\pi(N)$ one free summation index for each block of $\pi$ and that
$|t^{(k)}_{ij}| \le \|T_k\|$ for all $i, j$; thus one clearly has (2) with $\mathfrak{r}(\pi) = |\pi|$,
where $|\pi|$ is the number of blocks of $\pi$. However, this is far from optimal.
The main reason for a reduction of the exponent comes from the fact
that some of the indices which appear are actually used up for matrix
multiplication and thus do not contribute a factor of $N$. For example,
for $\sigma = \{(2,3), (4,5), \dots, (2m,1)\}$ one has

$$S_\sigma(N) = \sum_{\substack{j_1,\dots,j_{2m}=1 \\ j_2=j_3,\; j_4=j_5,\; \dots,\; j_{2m}=j_1}}^{N} t^{(1)}_{j_1 j_2} t^{(2)}_{j_3 j_4} \cdots t^{(m)}_{j_{2m-1} j_{2m}} = \sum_{i_1,\dots,i_m=1}^{N} t^{(1)}_{i_1 i_2} t^{(2)}_{i_2 i_3} \cdots t^{(m)}_{i_m i_1} = \mathrm{Tr}(T_1 \cdots T_m),$$

thus

$$|S_\sigma(N)| \le N \|T_1 \cdots T_m\| \le N \prod_{k=1}^{m} \|T_k\|.$$

Hence the trivial estimate $\mathfrak{r}(\sigma) = m$ can here actually be improved to
$\mathfrak{r}(\sigma) = 1$.
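The identity $S_\sigma(N) = \mathrm{Tr}(T_1 \cdots T_m)$ can be checked by brute force for small sizes. The sketch below evaluates the constrained sum (1) literally, by enumerating all multi-indices and discarding those that violate the constraint; the helper names and the integer test matrices are hypothetical and chosen only for illustration ($N = 2$, $m = 3$, 0-based index values).

```python
from itertools import product


def constrained_sum(blocks, mats):
    """Evaluate the sum (1) literally over all j in {0..N-1}^(2m) with ker j >= pi.

    blocks lists the blocks of pi as tuples of 1-based positions; positions
    not listed are treated as singleton blocks.
    """
    n, m = len(mats[0]), len(mats)
    total = 0
    for j in product(range(n), repeat=2 * m):
        # ker j >= pi: the indices inside each block of pi must coincide.
        if any(len({j[p - 1] for p in block}) > 1 for block in blocks):
            continue
        term = 1
        for k, t in enumerate(mats):
            term *= t[j[2 * k]][j[2 * k + 1]]  # entry t^(k+1)_{j_{2k+1} j_{2k+2}}
        total += term
    return total


def trace_of_product(mats):
    """Return Tr(T_1 T_2 ... T_m) for square matrices given as lists of rows."""
    n = len(mats[0])
    prod = mats[0]
    for t in mats[1:]:
        prod = [[sum(prod[i][k] * t[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]
    return sum(prod[i][i] for i in range(n))


# sigma = {(2,3), (4,5), (6,1)} for m = 3, with hypothetical 2 x 2 matrices.
sigma = [(2, 3), (4, 5), (6, 1)]
mats = [[[1, 2], [3, 4]], [[0, 1], [1, 1]], [[2, -1], [5, 3]]]
print(constrained_sum(sigma, mats), trace_of_product(mats))
```

The enumeration costs $N^{2m}$ steps, so it is meant only as a sanity check of the bookkeeping, not as a way to compute such sums.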

Other cases, however, might not be so clear. For example, what
would one expect for

$$\tau = \{(1), (2,4,11), (3,5,10), (6,7,8), (9,12,14,16,20), (13,15,17,18), (19,22,24), (21,23)\}? \tag{3}$$

The corresponding sum $S_\tau$ is

$$S_\tau(N) = \sum_{j_1,\dots,j_{24}=1}^{N} t^{(1)}_{j_1 j_2} t^{(2)}_{j_3 j_4} t^{(3)}_{j_5 j_6} t^{(4)}_{j_7 j_8} t^{(5)}_{j_9 j_{10}} t^{(6)}_{j_{11} j_{12}} t^{(7)}_{j_{13} j_{14}} t^{(8)}_{j_{15} j_{16}} t^{(9)}_{j_{17} j_{18}} t^{(10)}_{j_{19} j_{20}} t^{(11)}_{j_{21} j_{22}} t^{(12)}_{j_{23} j_{24}}$$

subject to the constraints

$$j_2 = j_4 = j_{11}, \quad j_3 = j_5 = j_{10}, \quad j_6 = j_7 = j_8, \quad j_9 = j_{12} = j_{14} = j_{16} = j_{20},$$
$$j_{13} = j_{15} = j_{17} = j_{18}, \quad j_{19} = j_{22} = j_{24}, \quad j_{21} = j_{23},$$

or, in terms of unrestricted summation indices:

$$S_\tau(N) = \sum_{i_1,\dots,i_8=1}^{N} t^{(1)}_{i_1 i_2} t^{(2)}_{i_3 i_2} t^{(3)}_{i_3 i_4} t^{(4)}_{i_4 i_4} t^{(5)}_{i_5 i_3} t^{(6)}_{i_2 i_5} t^{(7)}_{i_6 i_5} t^{(8)}_{i_6 i_5} t^{(9)}_{i_6 i_6} t^{(10)}_{i_7 i_5} t^{(11)}_{i_8 i_7} t^{(12)}_{i_8 i_7}. \tag{4}$$

The trivial estimate here is of order $N^8$, but it might not be obvious
at all that in fact the optimal choice is $\mathfrak{r}(\tau) = 3/2$. The non-integer
value in this case shows that the problem does not just come down to
a counting problem of relevant indices.

We will show that there is an easy and beautiful algorithm for
determining the optimal exponent $\mathfrak{r}(\pi)$ for any $\pi$. Actually, it turns out that
$\mathfrak{r}(\pi)$ is most easily determined in terms of a graph $G_\pi$ which is
associated to $\pi$ as follows. We start from the directed graph $G_{0_{2m}}$ with $2m$
vertices $1, 2, \dots, 2m$ and directed edges $(2,1), (4,3), \dots, (2m, 2m-1)$.
(This is the graph which corresponds to unrestricted summation, i.e.,
to $\pi = 0_{2m}$, where $0_{2m}$ is the minimal partition in $\mathcal{P}(2m)$ with $2m$
blocks, each consisting of one element. The reason that we orient our
edges in the apparently wrong direction will be addressed in Remark
2.) Given a $\pi \in \mathcal{P}(2m)$ we obtain the directed graph $G_\pi$ by identifying
in $G_{0_{2m}}$ the vertices which belong to the same blocks of $\pi$. We will not
identify the edges (actually, the direction of two edges between
identified vertices might be incompatible) so that $G_\pi$ will in general have
multiple edges as well as loops.

Figure 1. The graph $G_\tau$ for the sum (4).

For example, the graph $G_\tau$ for $\tau$ from Equation (3) is given in Figure
1. It should be clear how one can read off the graph $G_\tau$ directly from
Equation (4).
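The construction of $G_\pi$ can be sketched in a few lines of Python. The function name `graph_of_partition` and the convention of numbering the vertices of $G_\pi$ by the position of the block in the input list are our own assumptions for illustration; for $\tau$ from Equation (3), listed in the order given there, this numbering agrees with the indices $i_1, \dots, i_8$ of Equation (4).

```python
def graph_of_partition(blocks, m):
    """Return (vertex_of, edges) for the graph G_pi.

    blocks: list of tuples covering {1, ..., 2m}.
    vertex_of[p] is the 1-based number of the block containing p.
    edges[k-1] = (source, target) is the image of the edge (2k, 2k-1),
    so the k-th edge carries the matrix T_k.  Edges are never merged,
    which is why multiple edges and loops can appear.
    """
    vertex_of = {}
    for number, block in enumerate(blocks, start=1):
        for p in block:
            vertex_of[p] = number
    # Sanity check that every position 1..2m has been assigned to a block.
    assert sorted(vertex_of) == list(range(1, 2 * m + 1)), "blocks must cover 1..2m"
    edges = [(vertex_of[2 * k], vertex_of[2 * k - 1]) for k in range(1, m + 1)]
    return vertex_of, edges


# The partition tau from Equation (3), blocks in the order listed there.
tau = [(1,), (2, 4, 11), (3, 5, 10), (6, 7, 8),
       (9, 12, 14, 16, 20), (13, 15, 17, 18), (19, 22, 24), (21, 23)]
_, edges = graph_of_partition(tau, 12)
for k, (s, t) in enumerate(edges, start=1):
    print(f"T_{k}: i_{s} -> i_{t}")
```

Each printed line `T_k: i_s -> i_t` corresponds to the factor $t^{(k)}_{i_t i_s}$ in Equation (4), with the target index written first.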

The optimal exponent $\mathfrak{r}(\pi)$ is then determined by the structure of
the graph $G_\pi$. Before we explain how this works, let us rewrite the sum
(1) more intrinsically in terms of the graph $G = G_\pi$ as

$$S_G(N) := \sum_{i : V \to [N]} \prod_{e \in E} t^{(e)}_{i_{t(e)}, i_{s(e)}}. \tag{5}$$

We sum over all functions $i : V \to [N]$ where $[N] = \{1, 2, 3, \dots, N\}$, $V$
is the set of vertices of $G$, $E$ the set of edges, and $s(e)$ and $t(e)$ denote
the source vertex and the target vertex of $e$ respectively. Note that we
keep all edges through the identification according to $\pi$, thus the $m$
matrices $T_1, \dots, T_m$ in (1) show up in (5) as the various $T_e$ for the $m$
edges of $G_\pi$.
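For small graphs the graph sum (5) can be evaluated directly from its definition. The following sketch does exactly that; the name `graph_sum`, the 0-based vertex labels and the test matrix are assumptions made for illustration only.

```python
from itertools import product


def graph_sum(num_vertices, edges, mats):
    """Evaluate S_G(N) from (5) by brute force.

    edges[e] = (s, t) with 0-based vertex labels and mats[e] the N x N
    matrix T_e as a list of rows.  Each assignment i : V -> {0, ..., N-1}
    contributes the product over all edges of the entries T_e[i[t]][i[s]].
    """
    n = len(mats[0])
    total = 0
    for i in product(range(n), repeat=num_vertices):
        term = 1
        for (s, t), mat in zip(edges, mats):
            term *= mat[i[t]][i[s]]
        total += term
    return total


A = [[1, 2], [3, 4]]  # hypothetical 2 x 2 matrix
print(graph_sum(2, [(0, 1)], [A]))  # one edge between two vertices: sum of all entries
print(graph_sum(1, [(0, 0)], [A]))  # one loop at a single vertex: trace of A
```

The running time is $N^{|V|}$ times the number of edges, so this is useful only for checking small examples against the bounds discussed below.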

Remark 2. Note that a factor of $t^{(l)}_{i_r i_s}$ in the sum in (5) produces an
edge labelled $T_l$ starting at a vertex labelled $i_s$ and ending at a vertex
labelled $i_r$:

$$i_r \xleftarrow{\;T_l\;} i_s$$

This reversing of the indices is an artifact of the usual convention of
writing $TS$ for the operator where one first applies $S$ and then $T$.

Clearly $\pi$ and $G_\pi$ contain the same information about our problem;
however, since the bound on $S_G(N)$ is easily expressed in terms of $G_\pi$,
we will in the following forget about $\pi$ and consider the problem of
bounding the graph sum $S_G(N)$ in terms of $N$ for an arbitrary graph
$G$ with attached matrices. We will call a picture as in Figure 1 a graph
of matrices; for a precise definition, see Definition 8.

Example 3. In the figures below we give four directed graphs and
below each graph the corresponding graph sum. One can see that if
the graph is a circuit then the graph sum is a trace of the product of
the matrices. However, for more general graphs the graph sum cannot
easily be written in terms of traces. Nevertheless, as Theorem 6 will
show, there is a simple way to understand the dependence of the graph
sum on $N$, the size of the matrices.

[Figure: four directed graphs of matrices, each with its graph sum written below it; for the circuit with three edges the graph sum is $\mathrm{Tr}(T_1 T_2 T_3) = \sum_{i,j,k} t^{(1)}_{ij} t^{(2)}_{jk} t^{(3)}_{ki}$.]



The relevant feature of the graph is the structure of its two-edge 
connected components. 

Notation 4. 1) A cutting edge of a connected graph is an edge whose 
removal would result in two disconnected subgraphs. A connected 
graph is two-edge connected if it does not contain a cutting edge, i.e., 
if it cannot be cut into disjoint subgraphs by removing one edge. A 
two-edge connected component of a graph is a subgraph which is
two-edge connected and cannot be enlarged to a bigger two-edge connected
subgraph.

2) A forest is a graph without cycles. A tree is a component of a
forest, i.e., a connected graph without cycles. A tree is trivial if it
consists of only one vertex. A leaf of a non-trivial tree is a vertex
which meets only one edge. The sole vertex of a trivial tree will also
be called a trivial leaf.

JAMES A. MINGO AND ROLAND SPEICHER

Figure 2. The quotient graph $\mathfrak{F}(G_\tau)$ of Figure 1; the
forest here consists of just one tree.

It is clear that if one takes the quotient of a graph with respect to 
the two-edge connectedness relation (i.e., one shrinks each two-edge 
connected component of a graph to a vertex and just keeps the cutting 
edges), then one does not have cycles any more, thus the quotient is a 
forest. 

Notation 5. For a graph $G$ we denote by $\mathfrak{F}(G)$ its forest of two-edge
connected components: the vertices of $\mathfrak{F}(G)$ consist of the two-edge
connected components of $G$ and two distinct vertices of $\mathfrak{F}(G)$ are
connected by an edge if there is a cutting edge between vertices from the
two corresponding components in $G$.

For the graph from Figure 1 the corresponding forest $\mathfrak{F}(G_\tau)$ is drawn
in Figure 2.

Now we can present our main theorem on bounds for sums of the
form (5). In the special case of a two-edge connected graph we obtain
the same bound as appears in the book of Bai and Silverstein [2]. In
the general case, however, our bound is less than that of [2].

Theorem 6. 1) Let $G$ be a directed graph, possibly with multiple edges
and loops. For each edge $e$ of $G$ let an $N \times N$ matrix
$T_e = (t^{(e)}_{ij})_{i,j=1}^{N}$ be given. Let $E$ and $V$, respectively, be the edges and vertices of $G$
and

$$S_G(N) := \sum_{i : V \to [N]} \prod_{e \in E} t^{(e)}_{i_{t(e)}, i_{s(e)}}, \tag{6}$$

where the sum runs over all functions $i : V \to [N]$. Then

$$|S_G(N)| \le N^{\mathfrak{r}(G)} \prod_{e \in E} \|T_e\|, \tag{7}$$

where $\mathfrak{r}(G)$ is determined as follows from the structure of the forest
$\mathfrak{F}(G)$ of two-edge connected components of $G$:

$$\mathfrak{r}(G) = \sum_{l \text{ leaf of } \mathfrak{F}(G)} \mathfrak{r}(l),$$

where

$$\mathfrak{r}(l) := \begin{cases} 1, & \text{if } l \text{ is a trivial leaf,} \\ \tfrac{1}{2}, & \text{if } l \text{ is a leaf of a non-trivial tree.} \end{cases}$$

2) The bound in Equation (7) is optimal in the following sense. For
each graph $G$ and each $N \in \mathbb{N}$ there exist $N \times N$ matrices $T_e$ with
$\|T_e\| = 1$ for all $e \in E$ such that

$$S_G(N) = N^{\mathfrak{r}(G)}.$$

Figure 3. Putting the non-cutting edge matrices equal
to the identity matrix reduces the problem for $G_\tau$ of Figure 1
to this one.
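The exponent of Theorem 6 can be computed mechanically from an edge list. The sketch below is one straightforward way to do so, not an optimized one: it finds the cutting edges by deleting each edge in turn and recounting connected components, then counts leaves of the resulting forest. The function name `forest_exponent` and the 0-based vertex labels are our assumptions; the edge list for $\tau$ is read off from Equation (4), with $i_1, \dots, i_8$ relabelled as $0, \dots, 7$.

```python
def forest_exponent(num_vertices, edges):
    """Return r(G): 1 for each trivial leaf, 1/2 for each leaf of a non-trivial tree.

    edges is a list of (u, v) pairs on vertices 0..num_vertices-1.  The
    orientation is ignored; multiple edges and loops are allowed.
    """

    def components(skipped):
        """Label connected components, ignoring the edge indices in `skipped`."""
        label = [None] * num_vertices
        count = 0
        for start in range(num_vertices):
            if label[start] is not None:
                continue
            label[start] = count
            stack = [start]
            while stack:
                x = stack.pop()
                for idx, (u, v) in enumerate(edges):
                    if idx in skipped or x not in (u, v):
                        continue
                    y = v if x == u else u
                    if label[y] is None:
                        label[y] = count
                        stack.append(y)
            count += 1
        return label, count

    _, base = components(set())
    # An edge is cutting if deleting it increases the number of components.
    cutting = {idx for idx in range(len(edges)) if components({idx})[1] > base}
    # The two-edge connected components are what remains connected once
    # all cutting edges are removed.
    label, count = components(cutting)
    # Degree of each component in the forest of two-edge connected components.
    degree = [0] * count
    for idx in cutting:
        u, v = edges[idx]
        degree[label[u]] += 1
        degree[label[v]] += 1
    return sum(1.0 if d == 0 else 0.5 for d in degree if d <= 1)


# Edges of G_tau (source, target), one per matrix T_1, ..., T_12.
tau_edges = [(1, 0), (1, 2), (3, 2), (3, 3), (2, 4), (4, 1),
             (4, 5), (4, 5), (5, 5), (4, 6), (6, 7), (6, 7)]
print(forest_exponent(8, tau_edges))
```

For this edge list the cutting edges are those carrying $T_1$, $T_3$ and $T_{10}$, and the forest is a single tree with three leaves.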

Example 7. Consider again our example $S_\tau$ from (4). Its forest $\mathfrak{F}(G_\tau)$,
given in Figure 2, consists of one tree with three leaves; thus Theorem
6 predicts an order of $N^{3/2}$ for the sum (4). In order to see that
this can actually show up (and thus give the main idea for the proof
of optimality), put all the matrices in Figure 1 for the non-cutting
edges equal to the identity matrix; then the problem collapses to the
corresponding problem on the tree, where we are just left with the four
indices $i_1, i_2, i_4, i_7$ and the three matrices $T_1, T_3, T_{10}$. See Figure 3.
The corresponding sum is

$$S(N) := \sum_{i_1, i_2, i_4, i_7 = 1}^{N} t^{(1)}_{i_1 i_2} t^{(3)}_{i_2 i_4} t^{(10)}_{i_7 i_2}.$$

Let $V$ now be the matrix

$$V = \frac{1}{\sqrt{N}} \begin{pmatrix} 1 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 1 & 0 & \cdots & 0 \end{pmatrix} \tag{8}$$

and put $T_3 = V^t$, $T_1 = T_{10} = V$. Then $\|T_1\| = \|T_3\| = \|T_{10}\| = 1$ and
we have for this case

$$S(N) = N^{3/2}.$$
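The value $N^{3/2}$ in Example 7 can be confirmed numerically. The sketch below assumes that $V$ is the $N \times N$ matrix whose first column has all entries equal to $1/\sqrt{N}$ and whose other entries vanish (our reading of the matrix in the example), and that the reduced sum runs over $i_1, i_2, i_4, i_7$ with the factors $t^{(1)}_{i_1 i_2}$, $t^{(3)}_{i_2 i_4}$, $t^{(10)}_{i_7 i_2}$ obtained from Equation (4). The function name `example7_sum` is hypothetical.

```python
from itertools import product
from math import sqrt


def example7_sum(n):
    """Sum over i1, i2, i4, i7 of t1[i1][i2] * t3[i2][i4] * t10[i7][i2]."""
    # V: first column constant 1/sqrt(n), all other entries zero (norm 1).
    v = [[1 / sqrt(n) if col == 0 else 0.0 for col in range(n)] for _ in range(n)]
    v_t = [[v[col][row] for col in range(n)] for row in range(n)]
    t1, t3, t10 = v, v_t, v  # T_1 = T_10 = V and T_3 = V^t
    return sum(t1[i1][i2] * t3[i2][i4] * t10[i7][i2]
               for i1, i2, i4, i7 in product(range(n), repeat=4))


for n in (4, 9):
    print(n, example7_sum(n), n ** 1.5)
```

Only the terms with $i_2$ equal to the first index survive, which leaves three free indices, each contributing a factor $N$, against the overall factor $N^{-3/2}$ from the three matrices.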



Note that each tree of the forest $\mathfrak{F}(G)$ makes a contribution of at
least 1 in $\mathfrak{r}(G)$, because a non-trivial tree has at least two leaves. One
can also make the above description more uniform by having a factor
1/2 for each leaf, but counting a trivial leaf as actually two leaves. (The
reason for this special role of trivial leaves will become apparent in the
proof of Theorem 6 in the next section.) Note also that the direction
of the edges plays no role in the estimate above. The direction of an
edge is only important in order to define the contribution of an edge to
the graph sum. One direction corresponds to the matrix $T_e$, the other
direction corresponds to the transpose $T_e^t$. Since the norm of a matrix
is the same as the norm of its transpose, the estimate is the same for
all graph sums which correspond to the same undirected graph.

Finally, we want to give an idea of our strategy for the proof of
Theorem 6. One of the main steps consists in modifying the given graph
of matrices (by reversing some orientations, and by splitting some
vertices into two) in such a way that the corresponding sum $S_G(N)$ is
not changed and such that the modified graph has the structure of an
input-output graph. By the latter we mean that we have a consistent
orientation of the graph from some input vertices to some output
vertices, see Definition 10.

For example, a suitable modification of the graph $G_\tau$ is presented
in Figure 4. We have reversed the orientation of two edges (but
compensated this by taking the transpose of the attached matrices) and also
split each of three vertices, among them $i_4$ and $i_6$, into two copies. To take care of the
fact that in the summation we must have $i_4 = i_4'$ we have added an
additional edge between $i_4$ and $i_4'$ with the identity matrix attached,
and similarly for the other split vertices, such as $i_6$ and $i_6'$. So in order to obtain a bound
for $S_\tau$ it suffices to obtain a bound for the graph $G$ from Figure 4. But
this has now a kind of linear structure, with $i_4$ as input vertex and $i_1$
and $i_8$ as output vertices. This structure allows us to associate to the
graph $G$ an operator $T_G$ which is described in terms of tensor products
of the maps $T_e$ and partial isometries describing the splittings at the
internal vertices. $T_G$ maps from the vector space associated to $i_4$ to the
tensor product of the vector spaces associated to $i_1$ and $i_8$. It is then
fairly easy to see that the norm of $T_G$ is dominated by the product of
the norms of the involved operators $T_e$, and the estimate for the sum
$S_G(N)$ is finally just an application of the Cauchy-Schwarz inequality,
where each of the input and output vertices gives a factor $N^{1/2}$.

Figure 4. A modification of the graph $G_\tau$ from Figure
1 in input-output form. Note that the input vertex $i_4$
and the output vertices $i_1$ and $i_8$ are chosen from the
leaves of $\mathfrak{F}(G_\tau)$.

The rest of the paper is organized as follows. In Section 2, we
formulate a slight generalization of our theorem to rectangular matrices
and introduce abstractly the notion of a graph of matrices. Section 3
deals with input-output graphs and the norm estimates for their
associated operators. In Section 4 we address the general case by showing
how one can modify a general graph of matrices to become an
input-output graph. Finally, in Section 5, we generalize the considerations
from Example 7 to show the optimality of our choice for $\mathfrak{r}(G)$.

2. Generalization to Rectangular Matrices 

Let us first formalize the input information for Theorem 6. We will
deal here with the more general situation of rectangular instead of
square matrices. In order for the graph sum to make sense we require
that for a given vertex $v$ all the matrices associated with an incoming
edge have the same number of rows, $N_v$, and likewise all the matrices
associated with an outgoing edge have the same number of columns, $N_v$.
Moreover we shall find it advantageous to treat the matrices as linear
operators between finite dimensional Hilbert spaces. So for each vertex
$v$ let $\mathcal{H}_v = \mathbb{C}^{N_v}$ have the standard inner product and let $\{\xi_1, \dots, \xi_{N_v}\}$
be the standard orthonormal basis of $\mathbb{C}^{N_v}$. Note that we use the
convention that inner products $(x, y) \mapsto \langle x, y \rangle$ are linear in the second
variable and we shall use Dirac's bra-ket notation for rank one
operators: $|\xi\rangle\langle\nu|(\mu) = \langle \nu, \mu \rangle \xi$.

Definition 8. A graph of matrices consists of a triple $\mathfrak{G} = (G, (\mathcal{H}_v)_{v \in V}, (T_e)_{e \in E})$
in which

i) $G = (V, E)$ is a directed graph (possibly with multiple edges
and loops),

ii) $\mathcal{H}_v$ is a finite dimensional Hilbert space equal to $\mathbb{C}^{N_v}$ and

iii) $T_e : \mathcal{H}_{s(e)} \to \mathcal{H}_{t(e)}$ is a linear operator.

(This is also known as a representation of a quiver, but we shall not
need this terminology.)

Here is the generalization of Theorem 6 to the case of rectangular
matrices.

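Theorem 9. Let $\mathfrak{G} = (G, (\mathcal{H}_v)_{v \in V}, (T_e)_{e \in E})$ be a graph of matrices.
Let

$$S(\mathfrak{G}) := \sum_{i : V \to \mathbb{N}} \prod_{e \in E} \langle \xi_{i(t(e))}, T_e \xi_{i(s(e))} \rangle, \tag{9}$$

where the sum runs over all functions $i : V \to \mathbb{N}$ such that for each
$v \in V$ we have $1 \le i(v) \le N_v$.

Let $\mathfrak{F} = \mathfrak{F}(G)$ be the forest of two-edge connected components of $G$.
Then

$$|S(\mathfrak{G})| \le \prod_{\text{leaf } l \text{ of } \mathfrak{F}} \max_{v \in l} (\dim \mathcal{H}_v)^{\mathfrak{r}(l)} \cdot \prod_{e \in E} \|T_e\|, \tag{10}$$

where, for a leaf $l$, $v$ runs over all vertices in the two-edge connected
component of $G$ corresponding to $l$, and where

$$\mathfrak{r}(l) := \begin{cases} 1, & \text{if } l \text{ is a trivial leaf,} \\ \tfrac{1}{2}, & \text{if } l \text{ is a leaf of a non-trivial tree.} \end{cases}$$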

3. Estimate for Input-Output Graphs 



The main idea for proving the estimate (10) for a graph of matrices
is to first suppose that there is a flow from some vertices designated
input vertices, $V_{in}$, to some other vertices designated output vertices,
$V_{out}$, and then to show that every graph can be modified to have such
a flow. All the remaining vertices, which are neither input nor output
vertices, will be called internal vertices.

Definition 10. Let $G$ be a directed graph (possibly with multiple
edges). We say that $G$ is an input-output graph if there exist two
disjoint non-empty subsets, $V_{in}$ and $V_{out}$, of the set of vertices of $G$
such that the following properties are satisfied.

o $G$ does not contain a directed cycle. (Recall that a cycle is
a closed path and that a path is directed if all the edges are
oriented in the same direction.)
o Each vertex of $G$ lies on a directed path from some vertex in
$V_{in}$ to some vertex in $V_{out}$.
o Every internal vertex has at least one incoming edge and at
least one outgoing edge.
o Every input vertex has only outgoing edges and every output
vertex has only incoming edges.

Recall that $(\xi_i)_{i=1}^{N_v}$ is an orthonormal basis for $\mathcal{H}_v$. Let $V_0 \subset V$ be a
subset and suppose that we have a function $i : V_0 \to \mathbb{N}$ such that $i(v) \le N_v$;
then for each $v \in V_0$, $\xi_{i(v)}$ is an element of our orthonormal basis of $\mathcal{H}_v$.
Thus an element of our orthonormal basis of $\bigotimes_{v \in V_0} \mathcal{H}_v$ is specified by
a function $i : V_0 \to \mathbb{N}$ such that $i(v) \le N_v$ for each $v$ in $V_0$. When
it is clear from the context we shall just say that a basis element of
$\bigotimes_{v \in V_0} \mathcal{H}_v$ is specified by a function $i : V_0 \to \mathbb{N}$, but it should always
be understood that $i(v) \le N_v$.

Hence if we form $\{\bigotimes_{v \in V_0} \xi_{i(v)}\}_i$, where $i$ runs over all functions $i : V_0 \to \mathbb{N}$,
we obtain an orthonormal basis of $\bigotimes_{v \in V_0} \mathcal{H}_v$. Thus an operator
$T_{\mathfrak{G}} : \bigotimes_{v \in V_{in}} \mathcal{H}_v \to \bigotimes_{w \in V_{out}} \mathcal{H}_w$ can be specified by giving

$$\Big\langle \bigotimes_{w \in V_{out}} \xi_{j(w)},\; T_{\mathfrak{G}} \Big( \bigotimes_{v \in V_{in}} \xi_{i(v)} \Big) \Big\rangle$$

for each basis vector $i : V_{in} \to \mathbb{N}$ and $j : V_{out} \to \mathbb{N}$. In the theorem
below we shall show that a certain kind of graph sum can be written
in terms of a vector state applied to an operator defined by the inner
product above. This is the first of two key steps in proving Theorem 9.

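Theorem 11. Let $\mathfrak{G} = (G, (\mathcal{H}_v)_{v \in V}, (T_e)_{e \in E})$ be a graph of matrices
and assume that $G$ is an input-output graph with input vertices $V_{in}$ and
output vertices $V_{out}$.

1) We define $T_{\mathfrak{G}} : \bigotimes_{v \in V_{in}} \mathcal{H}_v \to \bigotimes_{w \in V_{out}} \mathcal{H}_w$ by

$$\Big\langle \bigotimes_{w \in V_{out}} \xi_{j(w)},\; T_{\mathfrak{G}} \Big( \bigotimes_{v \in V_{in}} \xi_{i(v)} \Big) \Big\rangle := \sum_{k : V \to \mathbb{N}} \prod_{e \in E} \langle \xi_{k(t(e))}, T_e \xi_{k(s(e))} \rangle, \tag{11}$$

where $i : V_{in} \to \mathbb{N}$, $j : V_{out} \to \mathbb{N}$ and $k$ runs over all maps $k : V \to \mathbb{N}$
such that $k|_{V_{in}} = i$ and $k|_{V_{out}} = j$.
Then we have

$$\|T_{\mathfrak{G}}\| \le \prod_{e \in E} \|T_e\|. \tag{12}$$

2) For the graph sum (9) we have

$$S(\mathfrak{G}) = \Big\langle \bigotimes_{w \in V_{out}} \xi^{w},\; T_{\mathfrak{G}} \bigotimes_{v \in V_{in}} \xi^{v} \Big\rangle,$$

where $\xi^{v} = \xi_1 + \cdots + \xi_{N_v} \in \mathcal{H}_v$, and we have the estimate

$$|S(\mathfrak{G})| \le \prod_{v \in V_{in} \cup V_{out}} \dim(\mathcal{H}_v)^{1/2} \cdot \prod_{e \in E} \|T_e\|. \tag{13}$$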

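Proof. The key point is to observe that we can write the operator $T_{\mathfrak{G}}$ as
a composition of tensor products of the edge operators $T_e$ and isometries
corresponding to the internal vertices. Every internal vertex has, by
the definition of an input-output graph, some incoming edges and some
outgoing edges, let's say $t$ incoming and $s$ outgoing (with $t, s \ge 1$).
Then the summation over the orthonormal basis of $\mathcal{H}_v$ for this internal
vertex corresponds to an application of the mapping $L_v : \mathcal{H}_v^{\otimes t} \to \mathcal{H}_v^{\otimes s}$
given by

$$L_v = \sum_{i=1}^{N_v} |\xi_i^{\otimes s}\rangle \langle \xi_i^{\otimes t}|.$$

In terms of our basis we have for all $1 \le i_1, \dots, i_t \le N_v$

$$L_v(\xi_{i_1} \otimes \cdots \otimes \xi_{i_t}) = \begin{cases} \underbrace{\xi_{i_1} \otimes \cdots \otimes \xi_{i_1}}_{s \text{ factors}}, & \text{if } i_1 = \cdots = i_t, \\ 0, & \text{otherwise.} \end{cases}$$

The mapping $L_v$ is, for all internal vertices $v$, a partial isometry, and
thus has norm equal to 1.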
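To make the vertex map concrete, the following sketch builds its matrix with respect to the product bases and checks the partial isometry property $L L^{*} L = L$. It assumes the description above: a basis tensor is sent to the $s$-fold tensor power of $\xi_i$ when all $t$ indices equal $i$, and to zero otherwise. The entries are real, so the adjoint is the transpose; the helper names are ours.

```python
from itertools import product


def vertex_isometry(n, t, s):
    """Matrix of L_v from the t-fold to the s-fold tensor power of C^n (t, s >= 1).

    Rows and columns are indexed by index tuples in lexicographic order;
    the entry is 1 exactly when all row and column indices coincide.
    """
    cols = list(product(range(n), repeat=t))
    rows = list(product(range(n), repeat=s))
    return [[1 if len(set(r + c)) == 1 else 0 for c in cols] for r in rows]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


L = vertex_isometry(2, 2, 1)  # two incoming edges, one outgoing edge, N_v = 2
print(L)
print(matmul(matmul(L, transpose(L)), L) == L)  # partial isometry check
```

In this picture the vertex acts as the compatibility check mentioned in the proof: tensors whose incoming indices disagree are annihilated, and the agreeing index is copied to every outgoing edge.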
It remains to put all the edge operators and the vertex isometries
together in a consistent way. For this, we have to make sure that we can
order the application of all these operators in a linear way so that their
composition corresponds to the operator defined by (11). However,
this is guaranteed by the input-output structure of our graph. We can 
think of our graph as an algorithm, where we are feeding input vectors 
into the input vertices and then operate them through the graph, each 
edge doing some calculation, and each vertex acting like a logic gate, 
doing some compatibility checks. The main problem is the timing of 
the various operations, in particular, how long one has to wait at a 
vertex, before applying an operator on an outgoing edge. In algorithmic 
terms, it is clear that one has to wait until all the input information 
is processed; i.e. one has to wait for information to arrive along the 
longest path from an input vertex to the given vertex. 

To formalize this, let us define a distance function $d : V \to \{0, 1, 2, \dots\}$
on our graph $G$ which measures the maximal distance from a vertex to
an input vertex,

$$d(v) := \max \{ k \mid \text{there exists a directed path of length } k \text{ from some input vertex to } v \}.$$

The length of a path is the number of edges it uses. Note that since
an input vertex has no incoming edges, we have $d(v) = 0$ for all input
vertices. The number $d(v)$ tells us how long we should wait before we
apply the isometry corresponding to $v$; after $d(v)$ steps all information
from the input vertices has arrived at $v$. Let $r$ be the maximal distance
(which is achieved for one of the output vertices). The distance function
$d$ gives us a decomposition of the vertices $V$ of our graph into disjoint
level sets

$$V_k := \{ v \in V \mid d(v) = k \}, \qquad V = \bigcup_{k=0}^{r} V_k.$$
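The distance function and its level sets can be computed by a memoized recursion over incoming edges. The sketch below assumes an input-output graph in the sense of Definition 10, so that the vertices without incoming edges are exactly the input vertices and there are no directed cycles; the function name `level_sets` and the small example graph are hypothetical.

```python
from functools import lru_cache


def level_sets(vertices, edges):
    """Group vertices by d(v), the length of the longest directed path ending at v.

    Assumes there is no directed cycle and that the vertices without
    incoming edges are exactly the input vertices.
    """
    incoming = {v: [] for v in vertices}
    for s, t in edges:
        incoming[t].append(s)

    @lru_cache(maxsize=None)
    def d(v):
        # Input vertices have distance 0; otherwise wait for the slowest predecessor.
        return 0 if not incoming[v] else 1 + max(d(s) for s in incoming[v])

    levels = {}
    for v in vertices:
        levels.setdefault(d(v), []).append(v)
    return [levels[k] for k in sorted(levels)]


# Hypothetical input-output graph with input "a" and outputs "d" and "e".
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("b", "e")]
print(level_sets("abcde", edges))
```

In this example the edge from `a` to `c` joins levels 0 and 2, and the output vertex `e` sits on level 2 rather than on the top level 3, so the graph is not yet in the normalized form in which every edge joins consecutive levels and every output vertex has maximal distance.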

Note that, for any edge $e$, we have $d(t(e)) \ge d(s(e)) + 1$. In order to
have a clearer notation it is preferable if our edges connect only vertices
which differ in $d$ exactly by 1. This can easily be achieved by adding
vertices on edges for which this difference is bigger than 1. The new
vertices have one incoming edge and one outgoing edge. We have of
course also to attach matrices to those edges, and we do this in such a
way that all incoming edges of the new vertices get the identity matrix,
while the original matrix $T_e$ is reserved for the last piece of our decomposition.
These new vertices will not change the operator $T_{\mathfrak{G}}$ nor the graph sum
$S(\mathfrak{G})$. In the same way we can insert some new vertices for all incoming
edges of the output vertices and thus arrange that every output vertex
$v$ has maximal possible distance $d(v) = r$. (Note that there cannot
be a directed path from one output vertex to another output vertex,
because an output vertex has no outgoing edges.)

Thus we can assume without loss of generality that we have $d(t(e)) = d(s(e)) + 1$
for all edges $e \in E$ and that $d(v) = r$ for all $v \in V_{out}$. We
have now also a decomposition of $E$ into a disjoint union of level sets,

$$E_k := \{ e \in E \mid d(t(e)) = k \}, \qquad E = \bigcup_{k=1}^{r} E_k.$$

Edges from $E_k$ are connecting vertices from $V_{k-1}$ to vertices from $V_k$.

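Note that our Hilbert spaces correspond on one side to the vertices,
but on the other side also to the edges as source and target Hilbert
spaces; to make the latter clearer, let us also write

$$T_e : \mathcal{H}_{\to e} \to \mathcal{H}_{e \to},$$

where of course $\mathcal{H}_{\to e}$ is the same as $\mathcal{H}_{s(e)}$ and $\mathcal{H}_{e \to}$ is the same as $\mathcal{H}_{t(e)}$.
We can now write

$$T_{\mathfrak{G}} = L_r \cdot T_r \cdot L_{r-1} \cdot T_{r-1} \cdots L_1 \cdot T_1 \cdot L_0, \tag{14}$$

where $L_k$ is the tensor product of all partial isometries corresponding
to the vertices on level $k$, and $T_k$ is the tensor product of all edge
operators corresponding to the edges on level $k$. More precisely,

$$T_k : \bigotimes_{e \in E_k} \mathcal{H}_{\to e} \to \bigotimes_{e \in E_k} \mathcal{H}_{e \to}$$

is defined as

$$T_k := \bigotimes_{e \in E_k} T_e,$$

whereas

$$L_k := \bigotimes_{v \in V_k} L_v,$$

with the vertex partial isometry

$$L_v : \bigotimes_{\substack{e \in E \\ t(e) = v}} \mathcal{H}_{e \to} \to \bigotimes_{\substack{f \in E \\ s(f) = v}} \mathcal{H}_{\to f}$$

given by

$$L_v = \sum_{i=1}^{N_v} |\xi_i^{\otimes s}\rangle \langle \xi_i^{\otimes t}|,$$

where $s$ and $t$ are the number of edges which have $v$ as their source
and target, respectively.

Since we do not have incoming edges for $v \in V_{in}$ nor outgoing edges
for $v \in V_{out}$, one has to interpret $L_0$ and $L_r$ in the right way. Namely,
for $v \in V_{in}$, the operator $L_v$ acts as

$$L_v : \mathcal{H}_v \to \bigotimes_{\substack{e \in E \\ s(e) = v}} \mathcal{H}_{\to e}$$

given by

$$L_v = \sum_{i=1}^{N_v} |\xi_i^{\otimes s}\rangle \langle \xi_i|,$$

and similarly for $v \in V_{out}$. (Formally, one can include this also in
the general formalism by adding one incoming half-edge to each input
vertex and one outgoing half-edge to each output vertex.) With this
convention, the product given in (14) is an operator from $\bigotimes_{v \in V_{in}} \mathcal{H}_v$ to
$\bigotimes_{w \in V_{out}} \mathcal{H}_w$. It is clear that (14) gives the same operator as (11).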



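Now the factorization (14) and the fact that all $L_v$ and thus all $L_k$
are partial isometries yield

$$\|T_{\mathfrak{G}}\| \le \prod_{k=0}^{r} \|L_k\| \cdot \prod_{k=1}^{r} \|T_k\| = \prod_{k=1}^{r} \prod_{e \in E_k} \|T_e\| = \prod_{e \in E} \|T_e\|.$$

This is the norm estimate (12) claimed for the operator $T_{\mathfrak{G}}$.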



In order to get the estimate for the graph sum $S(\mathfrak{G})$ we have to
note the difference between $T_{\mathfrak{G}}$ and $S(\mathfrak{G})$: for $T_{\mathfrak{G}}$ we sum only over the
internal vertices and thus remain with a matrix, indexed by the input
and output vertices; for $S(\mathfrak{G})$ we also have to sum over these input and
output vertices. If we denote by

$$\xi^{v} := \sum_{i=1}^{N_v} \xi_i \in \mathcal{H}_v$$

the sum over the vectors from our orthonormal basis of $\mathcal{H}_v$, then we
have

$$S(\mathfrak{G}) = \Big\langle \bigotimes_{w \in V_{out}} \xi^{w},\; T_{\mathfrak{G}} \bigotimes_{v \in V_{in}} \xi^{v} \Big\rangle.$$

An application of the Cauchy-Schwarz inequality yields then

$$|S(\mathfrak{G})| \le \|T_{\mathfrak{G}}\| \cdot \prod_{v \in V_{in}} \|\xi^{v}\| \cdot \prod_{w \in V_{out}} \|\xi^{w}\|.$$

Since the norm of $\xi^{v}$ is, by Pythagoras's theorem, given by $(\dim \mathcal{H}_v)^{1/2}$,
we get the graph sum estimate (13). □

4. Proof of the General Case

Let us now consider a graph of matrices as in Theorem 9. The
problem is that the underlying graph $G$ might not be an input-output graph.
However, we have some freedom in modifying $G$ without changing the
associated graph sum. First of all, we can choose the directions of the
edges arbitrarily, because reversing the direction corresponds to
replacing $T_e$ by its transpose $T_e^t$. Since the norm of $T_e$ is the same as the norm
of $T_e^t$ the estimate for the modified graph will be the same as the one
for the old graph. More serious is that, in order to apply Theorem 11,
we should also remove directed cycles in $G$. This cannot, in general,
be achieved by just reversing some directions. (As can clearly be seen
in the case of a loop.) The key observation for taking care of this is
that we can split a vertex $v$ into $v$ and $v'$ and redistribute at will the
incoming and outgoing edges from $v$ between $v$ and $v'$. We put one
new edge $f$ between $v$ and $v'$ with the corresponding operator $T_f$ being
the identity matrix. The constraint from $T_f$ in the graph sum will be
that after the splitting, the basis vector for the vertex $v$ has to agree
with the basis vector for the vertex $v'$, so summation over them yields
the same result as summation over the basis of $\mathcal{H}_v$ before the splitting.
Thus this splitting does not change the given graph sum. Since the
norm of the identity matrix is 1, this modification will also not affect
the wanted norm estimate.

One should of course also make sure that the forest structure of the 
two-edge connected components is not changed by such modifications. 
For the case of reversing arrows this is clear; in the case of splitting 
vertices the only problem might be that the new edge between v and v' 
is a cutting edge. This can actually happen, but only in the case where 
$v$ constitutes a two-edge connected component by itself. In that case,
we do the splitting as before but add two new edges between v and v', 
both with the same orientation and both with the identity operator. 
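Both modifications can be checked on toy examples by evaluating the graph sum by brute force before and after the change. In the sketch below, `graph_sum` is a direct evaluation of the graph sum for square matrices (entries `T_e[i[t]][i[s]]` multiplied over all edges), and the matrix `T` is hypothetical test data; none of this is part of the proof.

```python
from itertools import product


def graph_sum(num_vertices, edges, mats):
    """Brute-force graph sum; edges[e] = (s, t), mats[e] = T_e as a list of rows."""
    n = len(mats[0])
    total = 0
    for i in product(range(n), repeat=num_vertices):
        term = 1
        for (s, t), mat in zip(edges, mats):
            term *= mat[i[t]][i[s]]
        total += term
    return total


T = [[1, 2], [3, 4]]            # hypothetical test matrix
T_transposed = [[1, 3], [2, 4]]
identity = [[1, 0], [0, 1]]

# Reversing an edge while transposing its matrix leaves the sum unchanged.
print(graph_sum(2, [(0, 1)], [T]), graph_sum(2, [(1, 0)], [T_transposed]))

# A loop at vertex 0 is a directed cycle.  Split vertex 0 into 0 and 1:
# the loop becomes an edge 0 -> 1, and since vertex 0 was a two-edge
# connected component by itself, two identity edges 1 -> 0 are added.
loop = graph_sum(1, [(0, 0)], [T])
split = graph_sum(2, [(0, 1), (1, 0), (1, 0)], [T, identity, identity])
print(loop, split)
```

Note that in this small example the split graph still contains a directed cycle; the orientation of the identity edges would be reversed in a further step, which again does not change the sum because the identity matrix is symmetric.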

This motivates the following definition of the modification of a graph 
of matrices. 

Definition 12. We say that $\hat{\mathfrak{G}} = (\hat{G}, (\hat{\mathcal{H}}_v)_{v \in \hat{V}}, (\hat{T}_e)_{e \in \hat{E}})$ is a
modification of $\mathfrak{G} = (G, (\mathcal{H}_v)_{v \in V}, (T_e)_{e \in E})$, if the former can be obtained from
the latter by finitely many applications of the following operations:

o change the direction of the arrow of an edge $e$ and replace
$T_e : \mathcal{H}_{s(e)} \to \mathcal{H}_{t(e)}$ by its transpose $T_e^t : \mathcal{H}_{t(e)} \to \mathcal{H}_{s(e)}$;
o split a vertex $v$ into two vertices $v$ and $v'$, redistribute in some
way the incoming and outgoing edges for $v$ together with their
matrices to $v$ and $v'$ and add a new edge between $v$ and $v'$
with arbitrary direction for this edge and the identity matrix
attached to it; should $v$ be a two-edge connected component,
then we add two edges between $v$ and $v'$, both with the same
orientation, and both having the identity matrix attached to
them.

Our discussion from above can then be summarized in the following 
proposition. 

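Proposition 13. Let $\hat{\mathfrak{G}} = (\hat{G}, (\hat{\mathcal{H}}_v)_{v \in \hat{V}}, (\hat{T}_e)_{e \in \hat{E}})$ be a modification of
$\mathfrak{G} = (G, (\mathcal{H}_v)_{v \in V}, (T_e)_{e \in E})$. Then we have:

o the graph sums are the same,
$$S(\mathfrak{G}) = S(\hat{\mathfrak{G}});$$
o the forests of two-edge connected components are the same,
$$\mathfrak{F}(G) = \mathfrak{F}(\hat{G});$$
o the product of the norms of the edge operators is the same,
$$\prod_{e \in E} \|T_e\| = \prod_{e \in \hat{E}} \|\hat{T}_e\|.$$

Thus, in order to show the graph sum estimate (10) for $\mathfrak{G}$ it is enough
to prove this estimate for some modification $\hat{\mathfrak{G}}$.

So the crucial step for the proof of Theorem 9 is now to modify a
given graph $G$ to an input-output graph $\hat{G}$ with the right number of
input and output vertices.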

Proposition 14. Let $\mathfrak{G}$ be a graph of matrices. Then there exists a
modification $\hat{\mathfrak{G}}$ such that the underlying graph $\hat{G}$ of the modification is
an input-output graph.

Furthermore, the input and output vertices can be chosen such that:
for each non-trivial tree of the forest $\mathfrak{F}(\hat{G})$ $(= \mathfrak{F}(G))$ we have one leaf as
input leaf and all the other leaves as output leaves. For a trivial tree,
the trivial leaf is considered both as input and output leaf. The input
vertices of $\hat{G}$ shall consist of one vertex from each input leaf, and the
output vertices shall consist of one vertex from each output leaf.

Proof. Clearly we can assume that the underlying graph $G$ of $\mathfrak{G}$ is
connected, because otherwise we do the following algorithm separately
for each connected component.

For such a connected $G$, consider the tree of its two-edge connected
components. Declare arbitrarily one leaf as input leaf and all the other
leaves as output leaves; if the tree is trivial, we declare its only leaf
both as input and output leaf. Furthermore, we choose an arbitrary
vertex from the input leaf as input vertex, and for our output vertices
we choose an arbitrary vertex from each output leaf. The direction
from input leaf to output leaves defines uniquely a flow in our tree
from the input leaf to the output leaves, i.e., this gives us a direction
for the cutting edges of $G$.

For each two-edge connected component we define now one input
vertex and one output vertex. For the input leaf we have already
chosen the input vertex; its output vertex is the source vertex of one
(arbitrarily chosen) of the outgoing cutting edges. For the output leaves
we have already chosen their output vertices; as input vertex we take
the target vertex of the (unique) incoming cutting edge. For all the
other, non-leaf, components we choose the target vertex of the (unique)
incoming cutting edge as input vertex and the source vertex of one
(arbitrarily chosen) of the outgoing cutting edges as the output vertex.
We want all those input and output vertices to be different, which can
be achieved by splitting, if necessary, some of them into two.

So now each two-edge connected component has one input vertex and
one output vertex. If we are able to modify each two-edge connected
component in such a way that it is an input-output graph with respect
to its input and output vertex, then by putting the two-edge connected
components together and declaring all input vertices but the one from
the input leaf and all output vertices but the ones from the output
leaves as internal vertices, we get the modification $\hat{G}$ with the claimed
properties. It only remains to do the modification of the two-edge
connected components. This will be dealt with in the next lemma. □

Lemma 15. Let $\mathfrak{G}$ be a graph of matrices and assume that the 
underlying graph G is two-edge connected. Let v and w be two distinct 
vertices from G. Then there exists a modification $\hat{\mathfrak{G}}$ of 
$\mathfrak{G}$ such that the underlying graph $\hat{G}$ of the modification 
is an input-output graph, with input vertex v and output vertex w. 

Proof. The proof of this can be found in p] Ch. 11]. Let us recall 
the main steps. One builds a sequence $G_k$ of input-output graphs (all 
with v as input vertex and w as output vertex) such that each step 
is manageable and that the last graph is the wanted one. For this 
construction we ignore the given orientation of the edges of G, but 
will just use the information from G as undirected graph; then we 
will choose convenient orientations for the edges when constructing the 
sequence $G_k$. 

First, we choose a simple path (i.e., a path without cycles) in our 
graph G from v to w. We direct all edges on this path from v to w. 
This path with this orientation of edges is our first input-output graph 
$G_1$. 

Assume now we have constructed an input-output graph $G_k$. If this 
is not yet the whole graph, then we can choose an edge e which is not 
part of $G_k$ and which has one of its vertices, say x, on $G_k$. Let us 
denote the other vertex of e by z. Then one can find a simple path 
in G which connects z with $G_k$ and does not use e. (This is possible, 
because otherwise e would be a cutting edge.) Denote the end point 
of this path (lying on $G_k$) by y. (Note that y might be the same as z.) 
We have now to direct the new path between x and y, which consists of e 
followed by the path from z to y. If $x \neq y$, then there 
was: 

i) either a directed path from x to y in $G_k$, in which case we direct 
the new path also from x to y; 

ii) or a directed path from y to x in $G_k$, in which case we direct 
the new path also from y to x; 

iii) or there was no such path in $G_k$, in which case we can choose 
any of the two orientations for the new path between x and y. 

(Note that the first and second case cannot occur simultaneously, 
because otherwise we would have had a directed cycle in $G_k$.) 

The only problematic case is when x = y, i.e., when the new path is 
actually a cycle. In this case we split the vertex x = y into two different 
vertices, x and y; x gets all the incoming edges from $G_k$ and y gets all 
the outgoing edges from $G_k$, and the new edge is directed from x to y. 
Furthermore, the new cycle becomes now a directed path from x to y. 

Our new graph $G_{k+1}$ is now given by $G_k$ (possibly modified by the 
splitting of x into x and y) together with the new directed path between 
x and y. It is quite easy to see that $G_{k+1}$ is again an input-output 
graph, with the same input vertex and output vertex as $G_k$. 

We repeat this adjoining of edges until we have exhausted our origi- 
nal graph G, in which case our last input-output graph is the wanted 
modification. □ 
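
The construction recalled in this proof can be sketched in Python. This is an illustration under stated assumptions, not the authors' implementation: the names `bfs_path` and `input_output_orientation` are hypothetical, G is assumed two-edge connected with v different from w, and edges are given as a list of undirected pairs on the vertices 0, ..., n-1. One detail is a choice made here and not taken from the text: when a vertex is split, the copy carrying the incoming edges keeps the original label, except when w is split, where the copy that ends the new path keeps the label w, so that w remains the output vertex. Edges not yet adjoined stay attached to the labelled copy.

```python
from collections import deque

def bfs_path(adj, start, is_goal, banned_edge, blocked):
    """Path from start to the first vertex satisfying is_goal, returned as
    a list of (edge_index, next_vertex); avoids banned_edge and blocked."""
    if is_goal(start):
        return []
    prev, queue = {start: None}, deque([start])
    while queue:
        u = queue.popleft()
        for k, t in adj[u]:
            if k == banned_edge or k in blocked or t in prev:
                continue
            prev[t] = (u, k)
            if is_goal(t):
                path, cur = [], t
                while prev[cur] is not None:
                    parent, edge = prev[cur]
                    path.append((edge, cur))
                    cur = parent
                return path[::-1]
            queue.append(t)
    raise ValueError("no such path: the graph is not two-edge connected")

def input_output_orientation(n, edges, v, w):
    """Returns (number of vertices after splitting, list of directed edges)."""
    adj = [[] for _ in range(n)]
    for k, (a, b) in enumerate(edges):
        adj[a].append((k, b))
        adj[b].append((k, a))
    dag, used, on_graph, fresh = [], set(), {v}, n

    def add_path(walk):  # direct the walk from its first to its last vertex
        for s, t in zip(walk, walk[1:]):
            dag.append([s, t])
        on_graph.update(walk)

    def reaches(a, b):  # is there a directed path from a to b in G_k?
        stack, seen = [a], {a}
        while stack:
            u = stack.pop()
            if u == b:
                return True
            for s, t in dag:
                if s == u and t not in seen:
                    seen.add(t)
                    stack.append(t)
        return False

    first = bfs_path(adj, v, lambda t: t == w, None, used)
    used.update(k for k, _ in first)
    add_path([v] + [t for _, t in first])  # first graph: a path from v to w

    while len(used) < len(edges):
        e = next(k for k, (a, b) in enumerate(edges)
                 if k not in used and (a in on_graph or b in on_graph))
        a, b = edges[e]
        x, z = (a, b) if a in on_graph else (b, a)
        rest = bfs_path(adj, z, lambda t: t in on_graph, e, used)
        used.add(e)
        used.update(k for k, _ in rest)
        walk = [x, z] + [t for _, t in rest]
        y = walk[-1]
        if x == y:  # the new path is a cycle: split the vertex
            if x == w:
                for d in dag:  # new copy takes the incoming edges of w
                    if d[1] == w:
                        d[1] = fresh
                walk[0] = fresh
            else:
                for d in dag:  # new copy takes the outgoing edges of x
                    if d[0] == x:
                        d[0] = fresh
                walk[-1] = fresh
            fresh += 1
        elif reaches(y, x):  # case ii): direct the new path from y to x
            walk.reverse()
        add_path(walk)  # cases i) and iii): direct from x to y
    return fresh, [tuple(d) for d in dag]
```

For example, `input_output_orientation(4, [(0, 1), (1, 2), (2, 0), (1, 3), (3, 1)], 0, 2)` has to split vertex 1, because the two parallel edges between 1 and 3 form a cycle attached at a single vertex. The reachability test rescans all directed edges on every call, which keeps the sketch short but is not efficient.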

5. Proof of Optimality 

In order to show the second part of Theorem 6, namely that our exponent 
t(G) is optimal, we just have to adapt the corresponding considerations 
in Example 7 to the general case. For a given graph we attach to each 
non-cutting edge the identity matrix; thus all indices in a two-edge 
connected component of G get identified and we reduce the problem to 
the case that G is a forest. Since it suffices to look at the components 
separately, we can thus assume that G is a tree. If this tree is trivial, 
then we have no cutting edges left and we clearly get a factor N. 
Otherwise, we put an orientation on our tree by declaring one leaf 
as input leaf and all the other leaves as output leaves. Then we attach 
the following matrices to the edges of this tree 

$$
T_e = \begin{cases}
V^*, & \text{if $e$ joins the input leaf with an internal vertex},\\
V, & \text{if $e$ joins an output leaf with an internal vertex},\\
1, & \text{otherwise},
\end{cases}
$$

where V is the matrix from Example 7. Again, it is straightforward to 
see that this choice forces every index corresponding to an internal 
vertex to be equal to 1, whereas there is no restriction for the indices 
corresponding to the leaves; taking into account also the $1/\sqrt{N}$ 
factors from the operators V, we will get in the end $N^{\#\text{leaves}/2}$ 
for the sum. 
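
The count in this argument can be checked numerically on a small star-shaped tree. The sketch below rests on assumptions that are not taken from this section: V is taken to be the N x N matrix whose first column has all entries equal to $1/\sqrt{N}$ and whose other entries vanish (so its operator norm is 1), an edge from source s to target t contributes the entry `T[i_t][i_s]`, and the helper names `graph_sum`, `make_V`, `transpose` and `star_sum` are hypothetical. Indices are 0-based in the code, so the index that the text calls 1 appears as 0.

```python
import itertools
import math

def graph_sum(num_vertices, edges, N):
    """Brute-force sum over all index assignments of the product of entries.

    edges is a list of (source, target, T); each edge contributes the
    entry T[i_target][i_source] (assumed convention)."""
    total = 0.0
    for idx in itertools.product(range(N), repeat=num_vertices):
        term = 1.0
        for s, t, T in edges:
            term *= T[idx[t]][idx[s]]
        total += term
    return total

def make_V(N):
    """Assumed form of V: first column constant 1/sqrt(N), zeros elsewhere."""
    c = 1.0 / math.sqrt(N)
    return [[c if col == 0 else 0.0 for col in range(N)] for _ in range(N)]

def transpose(T):
    """Transpose; for a real matrix this is the adjoint."""
    return [list(row) for row in zip(*T)]

def star_sum(N, leaves):
    """Tree with one internal vertex 0, input leaf 1, output leaves 2..leaves
    (requires leaves >= 2)."""
    V = make_V(N)
    edges = [(1, 0, transpose(V))]                      # input leaf -> internal
    edges += [(0, k, V) for k in range(2, leaves + 1)]  # internal -> outputs
    return graph_sum(leaves + 1, edges, N)
```

With N = 4 and three leaves, `star_sum(4, 3)` evaluates to 8.0, which equals $4^{3/2}$: the internal index is forced to a single value, each of the three leaf indices is free, and each of the three edges contributes a factor 1/2.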